DOCUMENT RESUME



ED 078 086 



TM 002 905



AUTHOR        O'Leary, K. Daniel
TITLE         The Effects of Observer Bias in Field-Experimental
              Settings. Final Report.
INSTITUTION   State Univ. of New York, Stony Brook. Dept. of
              Psychology.
SPONS AGENCY  National Center for Educational Research and
              Development (DHEW/OE), Washington, D.C. Regional
              Research Program.
PUB DATE      Mar 73
GRANT         OEG-2-71-0017
NOTE          23p.

EDRS PRICE    MF-$0.65 HC-$3.29
DESCRIPTORS   Attitudes; Behavior Patterns; Children; *Data
              Collection; Evaluation Criteria; Expectation; *Field
              Studies; *Hypothesis Testing; *Observation;
              Questionnaires; Research Methodology; Tape
              Recordings; Technical Reports



ABSTRACT 

A series of studies on observer biases revealed that
simply informing observers of experimental hypotheses does not
produce observational data consonant with those hypotheses. However,
questionnaire responses following an experiment with different
induced expectations do produce global data consonant with
experimental hypotheses. In addition, if the observers are informed
of the experimental hypotheses and the investigator provides daily
feedback to the observers indicating how well their data support his
hypotheses, the observers will report data consonant with those
hypotheses. The method of investigation in the studies reported
involved having observers watch specially prepared video tapes of
children who exhibited significant amounts of disruptive behavior.
Following a pre-treatment or baseline period, observers were then
asked to watch video tapes on which children displayed no change or
marked reductions in disruptive behavior during a "treatment period."
While observer biases per se did not result in confounded data in any
of the studies, an unanticipated problem of observer drift or
changing observational criteria can result in seriously confounded
data where groups of observers initially trained together are later
assigned to different treatment conditions. (Author/CK)



FILMED FROM BEST AVAILABLE COPY 






Final Report

Grant No. OEG-2-71-0017



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED
EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING
IT. POINTS OF VIEW OR OPINIONS
STATED DO NOT NECESSARILY REPRESENT
OFFICIAL NATIONAL INSTITUTE OF
EDUCATION POSITION OR POLICY.



The Effects of Observer Bias in Field-Experimental Settings






K. Daniel O'Leary

Psychology Department
State University of New York
Stony Brook, New York 11790

March, 1973 



The research reported herein was performed pursuant to Grant
OEG-2-71-0017 from the U.S. Office of Education, Department
of Health, Education and Welfare. Contractors undertaking
such projects under government sponsorship are encouraged
to express freely their professional judgment in the conduct
of the project. Points of view or opinions stated do not,
therefore, necessarily reflect official Office of Education
position or policy.



U.S. Department of Health, Education and Welfare

Office of Education 
Regional Research Program 



Table of Contents 



Introduction 

Procedures and Results 

A. The Effects of Observer Bias 

B. The Reactive Nature of Reliability Assessment

Conclusions 

Bibliography 

Appendix: Observational Code for Disruptive Behavior 



Abstract 



A series of studies on observer biases revealed that
simply informing observers of experimental hypotheses does
not produce observational data consonant with those
hypotheses. However, questionnaire responses following an
experiment with different induced expectations do produce
global data consonant with experimental hypotheses. In
addition, if the observers are informed of the experimental
hypotheses and the investigator provides daily feedback to
the observers indicating how well their data support his
hypotheses, the observers will report data consonant with
those hypotheses. The method of investigation in the studies
reported involved having observers watch specially prepared
video tapes of children who exhibited significant
amounts of disruptive behavior. Following a pre-treatment
or baseline period, observers were then asked to watch
video tapes on which children displayed no change or marked
reductions in disruptive behavior during a "treatment
period." While observer biases per se did not result in
confounded data in any of the studies, an unanticipated
problem of observer drift or changing observational criteria
can result in seriously confounded data where groups of
observers initially trained together are later assigned to
different treatment conditions. Similarly, experimenters
can inadvertently shape data consonant with their
experimental hypotheses if they inform the observers of the
expectation and give them feedback regarding
how well their data conform with that expectation.



Introduction



The issue of the biasing effects of the expectations of observers
and experimenters was raised in psychology in bold fashion by
Rosenthal in 1963, and since that time expectancy studies
have assumed a position of disputed prominence in various
circles. In a prototypic experiment (Rosenthal & Fode,
1963), naive rats were randomly assigned to two groups of
experimenters in a maze learning study. One group of
experimenters (undergraduates) were told that they were testing
maze-bright animals and the other group of experimenters
were told that they were testing maze-dull animals.
Experimenters who were told that their animals were bright
reported faster learning times for their animals than the
experimenters who were told that their animals were
maze-dull. Rosenthal extended a variant of this work to
classroom settings where he informed teachers that certain
randomly selected children in their classes were spurters (i.e.,
"late bloomers" with unrealized academic potential). On
the basis of pre- and post-testing in the fall and spring
it was found that children in the experimental group, i.e.,
spurters, had a greater increase in IQ than did the controls
(Rosenthal & Jacobsen, 1966). Part of the heat generated
from the Rosenthal studies is of course due to the
possibility that all psychological experimentation involving an
informed human observer could be confounded by the observer
bias; however, another equally important reason for such
heat is the failure of many people to replicate
Rosenthal's work in both the laboratory and the classroom
(Barber & Silver, 1968; Clairborn, 1969). Despite the
failures to replicate and extensive criticisms of Rosenthal's
methodology (Snow, 1969; Thorndike, 1969), investigators
applying learning principles to the modification of behavior
have taken note of the expectancy phenomenon. For example,
Thomas, Becker, and Armstrong (1968) noted that their
observers were not informed of changes in experimental
conditions. McKenzie, Clark, Wolf, Kothera, and Benson (1968)
noted that their observers were not informed of the type
of home based token procedure utilized in their study nor
were they informed when the token program was put into
effect. In order to "control for any bias in ratings," O'Connor
(1969) kept observers unaware of assignment of subjects
to various treatment or control conditions and each
observer watched a random combination of treated and control
subjects. Bushell, Wrobel, and Michaelis (1968) had
classroom observers record behavior descriptions which were later
coded as study or non-study behavior. As they stated, "A
description might have been coded 'Non-Study' on Day 15 and
'Study' on Day 19 simply because the observer expected study
behavior to increase during the final contingent (or
treatment) phase." Consequently, Bushell et al. trained new
coders who had no knowledge of the details of the original
investigations. Despite these precautions and the more
systematic coding check of Bushell et al., there has been
only one study dealing with the effects of observer bias
in the classroom studies where the biases or expectations
were independently manipulated (Kass & O'Leary, 1970). As
a side issue in a study several years ago Scott, Burton,
and Yarrow (1967) did find a significant difference in the
observations of one informed observer and a group of
uninformed observers, using positive and negative acts as
dependent measures. That is, the informed observer's records
were more confirming of the experimental hypothesis than
were those of the uninformed observers. However, the Scott,
Burton, and Yarrow study used only one informed observer
(the senior author) and the dependent measures were rather
global; e.g., positive acts included suggestions, sharing
ideas, helping, showing concern for others, and carrying on
friendly conversation. In addition, many of the
subcategories were quite unreliable, with reliabilities ranging from
-.09 to .80. More importantly, even though experimenters
may not inform their observers of the informal hypotheses,
the observers may easily become aware of any experimental
changes. For example, in a study by Madsen, Becker, and
Thomas (1968) where teachers were told to praise appropriate
behavior and ignore disruptive behavior, the observers were
not informed of the experimental conditions, but the
investigators noted that the changes were often dramatic enough
that observer comments clearly reflected programmed changes
in the teacher's behavior. Furthermore, when a treatment
condition is in effect, the experimenter or graduate
assistant may subtly or even overtly reinforce the observer for
bringing him "good" or confirming data with comments like
"That's really interesting," "That teacher is having some
effect on those kids," or, more openly, "That treatment
almost never fails to produce an increase in appropriate
behavior." Because of the principal investigator's
involvement with token reinforcement studies (O'Leary & Becker,
1967; O'Leary, Becker, Evans, & Saudargas, 1969), and the
near impossibility of deceiving the observers about the
onset and intended experimental effects of a token program,
the problem of observer bias has been a particularly
pressing problem in his research and that of others similarly
aware of the expectancy problem. Despite admonitions to
them to carefully monitor their own behavior in this regard,
several observers have reported that although they were
aware of the problem, their results might still be biased
because of their knowledge of the hypotheses of the study.

Kass and O'Leary (1970) systematically manipulated
predictions of treatment effects for three groups of
observers who recorded the behavior of children from videotapes of
a classroom setting. Two groups of observers were told,
respectively, that level of disruptive behavior from
"baseline" to "treatment" phases of the study a) would increase
and b) would decrease. The third group was given no
prediction of results. In fact, all groups of observers
viewed the same video tapes which were selected, on the
basis of a priori ratings, to show a substantial decrease
from baseline to treatment. Significant effects associated
with the main treatment manipulation were obtained among
the three groups on five out of nine categories. Because
of differences among the three groups found during baseline,
an analysis of covariance was conducted. This analysis
revealed that after adjusting for initial differences during
baseline, the three groups still showed significant effects
on four of the five categories. Visual inspection of the
ordering of the three groups' means on these four categories
revealed that on two of the four the differences clearly
were not in line with the predictions. As will be seen
later, the differences obtained in this study may have been
largely due to observer drift, or random fluctuations in
the observational criteria within the three groups.






Procedures and Results



A. The Effects of Observer Bias

A doctoral dissertation was designed by Kent, 1972, to
assess the effects of knowledge of predicted results by
observers on behavioral recordings generated under
circumstances similar to field-experimental investigations in
behavior modification. The experimental variables were:
predicted behavior change from baseline to treatment
condition (decrease vs. no change); actual behavior change
(decrease vs. no change); and expectation induction (prior
to baseline vs. subsequent to baseline). An observational
code developed to measure the disruptive behavior of
children in a classroom was employed. The categories of behavior
comprising this code were: out of chair, modified out of
chair, touching other's property, vocalization, playing,
orienting, noise, aggression and time off task.



Forty observers were trained as a group for seventeen [...]

[...] by "treatment" videotapes demonstrating either decrease or
no change from baseline levels of disruptive behavior.
Eight "baseline" and eight "treatment" ratings were obtained
on each of two target children from the five observers in
each experimental group.

An analysis of behavioral recordings of "pre-baseline"
videotapes revealed greater than five per cent significant
differences among the experimental groups on the nine
behavioral categories prior to the experimental manipulations.
Analysis of "pre-baseline" and "baseline" recordings of
four groups which received no experimental intervention
until immediately prior to viewing the "treatment" videotapes
revealed: a) greater than five per cent significant
differences among the experimental groups; b) a greater number
of significant differences among "baseline" than
"pre-baseline" recordings; and c) virtually no similarity between
particular differences which existed in "pre-baseline" and
"baseline" recordings. Under this circumstance, the
differences in behavioral recordings as a function of the
experimental manipulation were completely and inextricably
confounded with differences which evolved spontaneously among
the experimental groups. That is, groups of observers tend
to "drift" or randomly modify their definitions of the
behavioral code.
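
The confound described above can be illustrated with a small simulation. The sketch below is not the analysis used in the dissertation; it assumes, purely for illustration, that each group's recording criterion follows a random walk from session to session while every group views identical tapes. The function name, the drift model, and all numeric values are hypothetical.

```python
import random


def simulate_drift(n_groups=4, n_sessions=8, true_rate=10.0,
                   drift_sd=0.05, seed=0):
    """Return each group's mean recorded frequency for identical tapes.

    Hypothetical drift model: a group's criterion starts at 1.0 and takes
    a Gaussian random-walk step before every session; the recorded
    frequency is true_rate multiplied by the current criterion.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_groups):
        criterion = 1.0
        total = 0.0
        for _ in range(n_sessions):
            criterion += rng.gauss(0.0, drift_sd)  # random change in criterion
            total += true_rate * criterion
        means.append(total / n_sessions)
    return means


group_means = simulate_drift()
# Spread between groups that arises without any experimental manipulation.
spread = max(group_means) - min(group_means)
# With no drift, every group recovers the true rate exactly.
no_drift = simulate_drift(drift_sd=0.0)
```

Under these assumptions the groups differ from one another even though the tapes and the behavior are the same, which is the sense in which spontaneous differences can be mistaken for treatment effects.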

The problem of observer drift was completely
unanticipated and this drift may have accounted for the differences
attributed to expectation in the Kass and O'Leary, 1970,
study. The groups of observers in the Kass and O'Leary study
were trained separately and were later assigned to separate
expectation groups. In fact, the pretreatment differences
of Kass and O'Leary (1970) and Kent (1972) are now clearly
attributable to this observational drift.

Because of the observer drift previously obtained,
Kent, O'Leary, Diament, and Dietz (1972) designed a study
to re-examine the effects of predicted results (expectation)
on the observational recordings of trained observers. This
study was specially designed to avoid the possibility that
differential "drift" in definition of the behavior code in
the experimental conditions would be confounded with the
effects of predicted results. Videotapes of children in a
classroom during "baseline" and "treatment" conditions were
rated by two groups of observers, employing a standard
nine category behavioral code for disruptive behavior. The
two groups of 10 observers were told that they were viewing,
respectively, the effects of a) a token program which
would dramatically reduce the level of disruptive behavior
from baseline and b) a control program which would produce
no change from baseline. In fact, the same videotapes were
viewed by all observers. A priori ratings of a pool of
videotapes were utilized to create "baseline" and "treatment"
conditions which were matched for level of disruptive
behavior. However, after each "treatment" recording period,
observers in each group were told that a casual examination
of their recordings indicated that the predicted results
were emerging. This was intended to increase the similarity
to field settings in which such casual feedback may often
be given, and to enhance the likelihood that biases due to
predicted results would occur. This design used, within
each experimental condition, five pairs of observers who
computed reliability only within pairs. Thus "drift"
among the five pairs of observers who were told they would
view the effects of a token procedure, and among the five
pairs of observers who were told they would view a control
procedure, could be separated from the effects of predicted
results on behavioral recordings. Following the final
experimental session, both groups were given a questionnaire
to determine whether observers understood the results
predicted for their group. In addition, observers were asked
what they anticipated and what they perceived as the results
of the experimental condition they viewed. Finally, all
observers were asked if they felt they had been misinformed
about any aspect of the study.
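
The pairing arrangement described above can be sketched in code. The report does not state the reliability formula that was used, so the example below assumes simple interval-by-interval percent agreement; the observer labels, the helper names, and the sample records are hypothetical.

```python
def make_pairs(observers):
    """Split an even-length list of observers into fixed consecutive pairs."""
    if len(observers) % 2 != 0:
        raise ValueError("an even number of observers is required")
    return [(observers[i], observers[i + 1])
            for i in range(0, len(observers), 2)]


def interval_agreement(record_a, record_b):
    """Fraction of intervals on which two observers made the same entry.

    Assumed reliability index (not confirmed by the report): agreements
    divided by the total number of intervals.
    """
    if len(record_a) != len(record_b) or not record_a:
        raise ValueError("records must be non-empty and of equal length")
    agreements = sum(1 for a, b in zip(record_a, record_b) if a == b)
    return agreements / len(record_a)


# Ten observers told they were viewing a token program form five pairs.
token_group = [f"T{i}" for i in range(1, 11)]
pairs = make_pairs(token_group)

# Hypothetical interval records for one pair (1 = behavior scored).
record_1 = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
record_2 = [1, 0, 1, 1, 1, 0, 0, 0, 0, 0]
agreement = interval_agreement(record_1, record_2)
```

Because reliability is only ever computed inside a pair, any shared shift in criteria is confined to that pair and can be estimated as between-pair variation within a condition.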

Global evaluations of treatment effects obtained on a
post questionnaire were significantly affected by predicted
results but behavioral recordings were not. That is,
although there were no differences in the actual frequencies
of behavior recorded in the two experimental groups, when
observers were asked "What actually happened to the level
of disruptive behavior from the baseline to the treatment
condition," they reported data consonant with the
experimental hypotheses. Nine of the ten observers for whom a
decrease in level of disruptive behavior from baseline to
treatment conditions was predicted reported actually
viewing a decrease. Seven of the ten observers for whom no
change was predicted reported viewing no change.

While no observational differences were obtained in the
Kent et al. (1972) study that could be attributed to
induced expectations, it was still possible that induced
expectations combined with experimenter feedback indicating
how well the observational data fit with his predictions
would result in biased data. Consequently, a study designed
to shape data consonant with experimenter hypotheses was
conducted by O'Leary, Kent, and Kanowitz, 1972.

Four undergraduate females watched specially prepared
videotapes supposedly representing baseline and treatment
conditions in a classroom for emotionally disturbed
children. In fact, however, there were no differences in rates
of disruptive behavior in the two conditions (the baseline
and treatment). The study was presented to the observers
as an investigation designed to replicate some earlier
research on token programs in which only reinforced behaviors
in the token program decreased. It was stated that other
behaviors not reinforced would presumably not change.

After insuring adequate reliabilities in a pre-baseline
condition, the four observers watched four baseline tapes
and then four pseudo treatment tapes. The experimenter gave
the four students the specific expectations regarding the
outcome of the experiment, and in addition during the pseudo
treatment condition, this experimenter gave the students
positive or negative feedback regarding how well their data
conformed to the experimental hypotheses. More specifically,
he shaped their data recording by giving positive feedback
to the observers only if their data conformed with the
experimental hypotheses. That is, he made positive comments
like, "The data really seems to be reflecting the treatment
change" or negative comments like, "It's strange that you have
so many disruptive behaviors; this treatment usually works."

The observers' data were converted to difference scores
between a group of four well trained observers (criterion
observers) and themselves. These differences during
baseline and treatment were then subject to an analysis of
variance which allowed us to separate effects on the categories
predicted to decrease and those predicted not to decrease.
The data clearly supported the proposition that one can
shape data consonant with one's experimental hypotheses if,
in addition, the observers are informed of that expectation.
On two categories of behavior predicted to change, the
changes were reported by the observers, despite the fact that
there were no similar changes in behavior as recorded by
the criterion observers. In contrast, on those categories
not predicted to change, no change was reported by the
observers.
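
The difference-score step can be made concrete with a short sketch. It assumes, for illustration only, that a difference score is an observer's recorded frequency minus the mean frequency recorded by the four criterion observers for the same tape; the report does not give the exact formula or sign convention, and all counts below are hypothetical rather than data from the study.

```python
from statistics import mean


def difference_scores(observer_counts, criterion_counts):
    """Observer count minus the mean criterion count, tape by tape.

    observer_counts: one frequency per tape for a single observer.
    criterion_counts: for each tape, the frequencies recorded by the
    criterion observers (assumed sign convention: observer - criterion).
    """
    return [obs - mean(crit)
            for obs, crit in zip(observer_counts, criterion_counts)]


# Hypothetical counts for one category predicted to decrease.
baseline_obs = [12, 11, 13, 12]
baseline_crit = [[12, 12, 11, 13], [11, 11, 12, 10],
                 [13, 12, 13, 14], [12, 12, 12, 12]]
treatment_obs = [9, 8, 8, 7]
treatment_crit = [[12, 11, 12, 13], [11, 12, 11, 10],
                  [12, 12, 13, 11], [12, 12, 11, 13]]

baseline_shift = mean(difference_scores(baseline_obs, baseline_crit))
treatment_shift = mean(difference_scores(treatment_obs, treatment_crit))
```

In this made-up example the observer matches the criterion observers during baseline (mean difference 0) and records fewer behaviors than they do during the pseudo treatment tapes (mean difference -3.75), which is the pattern a shaping effect would produce while the criterion record stays flat.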

B. The Reactive Nature of Reliability Assessment

As a result of the Kass and O'Leary (1970) and Kent
(1972) studies, it became apparent that observers may drift
naturally in their observational criteria when different
observer groups are assigned to observe the same phenomena.
Consequently, a study was conducted by Romanczyk, Kent,
Diament, and O'Leary (in press) to assess whether observers
would modify their recordings to match reliability checkers
who adopted differing observational criteria. Throughout
a study, two reliability checkers each employed a unique modified
version of our standard observational code. Four of the
nine categories of the behavioral rating code were modified
to produce stable but differential observational criteria for
the two assessors. This manipulation was intended to
increase the detectability of matching by the observers of the
different observational criteria employed by each assessor.
As a result of these modifications, the code employed by
Assessor I produced a higher frequency than the code employed
by Assessor II on two categories: vocalization and noise.
In employing the modified code, Assessor I would record
even the softest vocalizations and also any "mouthings" the
child might make as vocalizations, while Assessor II would
record only the louder vocalizations and ignore such behaviors
as humming, whispering, and sighing. Further, the
behavioral code was modified so that Assessor II would record a
greater frequency than Assessor I on two other behaviors:
playing and orienting. It was required that these
differential observational criteria be sufficiently well-defined
that the assessors would be reliable with each other at a
moderate level and that this level of reliability between
assessors not vary across experimental conditions. In
short, an artificial difference was created between the
ratings of Assessor I and Assessor II.

For two and one half weeks prior to the experiment each
assessor employed his respective version of the modified
code and, on regular but different occasions, computed total
reliability (for modified and unmodified categories combined)
with each observer. Reliability was computed for five
observers a median of four times (range 2-4). These
reliability computations provided the only opportunity for observers
to note the unique observational criteria being employed
by the two assessors. At no time, however, did either
reliability assessor make any statement that overtly
contrasted his rating criteria with those of the other
assessor.



[...] which reliability checker was measuring reliability
produced a substantial shift in observational criteria. Thus,
observers adjusted their rating criteria as a function of
the feedback they received. That is, observers adopt
idiosyncratic rating criteria in order to match the
observational criteria of their reliability checker.






Conclusions

Induced expectations per se failed to influence the
recordings of disruptive classroom behavior by undergraduate
observers. However, the induced expectations did influence
global evaluations of such observers. That
is, despite the fact that observers did not change their
recordings to be consonant with the induced expectations,
when asked what their recordings showed following the
study, they reported that they had recorded behaviors which
reflected treatment changes.

When experimental expectations were combined with
shaping of data consonant with experimental hypotheses, observer
recordings were markedly influenced. The implications of the
data were unequivocal: one should not provide daily
evaluative feedback regarding the extent to which the observer's
recordings are reflecting the expected treatment change.

An unanticipated problem of observer drift was
encountered in this research which completely confounded the
results of one expectation study (Kent & O'Leary, 1973) and
which rendered a different interpretation of an earlier
study by Kass (1971). The observer drift refers to a random
fluctuation in the observational criteria used by groups of
observers assigned to different treatment conditions. The
phenomenon of drift was clearly documented by O'Leary and
Kent, 1973. It appears that the process of computing
reliability and discussing differences in recording modifies
an observer's interpretation of the behavioral code to more
closely match those observers with whom he is working. When
observers are divided into different groups, different
modifications of the observational code may emerge. These
modifications appear to have a random effect on data
generated and must be differentiated from possible systematic
biases due to observer expectations. The implications of
the observer drift problem are very serious and O'Leary
and Kent have suggested various ways of dealing with the
problem.

These data suggest that it is unwise to confound
individual observers or groups of observers with different
experimental conditions. However, even in single group
within-subject designs, there exists the possibility that observers
may "drift" in their application of a behavioral code,
yielding data recorded during one experimental condition
incomparable to data recorded during a subsequent condition.
Montrose Wolf (personal communication, 1972) has suggested
a procedure of training a new group of observers several
weeks after the initiation of a study and assessing their
comparability to observers who have been collecting data in
this setting. When no differences are found between the two
groups of observers, it is clear that drift has not occurred.

However, in the absence of such verification, it seems
prudent to take one of several steps to avoid confounding
observer drift with differential treatment interventions. In
between-subject designs, one could employ a single group of
observers to record data from all treatment groups.
Alternately, several groups of observers could be rotated
periodically from one treatment group to another. Clearly neither
of these procedures guarantees that the recordings from a
particular experimental condition will represent comparable
applications of the behavioral code at any two points in
time. These procedures do assure, however, that the data
from each treatment group will be equally affected by any
modifications in the behavioral code which do occur.
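
The rotation procedure for between-subject designs can be sketched as a simple cyclic schedule. This is only one way such a rotation might be arranged; the function name, the observer group labels, and the treatment names are hypothetical and do not come from the report.

```python
def rotation_schedule(observer_groups, treatment_groups, n_periods):
    """Assign observer groups to treatment groups, shifting by one each period.

    Returns a list with one dict per period, mapping each treatment group
    to the observer group that records it during that period.
    """
    if len(observer_groups) != len(treatment_groups):
        raise ValueError("need one observer group per treatment group")
    k = len(observer_groups)
    schedule = []
    for period in range(n_periods):
        assignment = {
            treatment_groups[(i + period) % k]: observer_groups[i]
            for i in range(k)
        }
        schedule.append(assignment)
    return schedule


# Three hypothetical observer groups rotated over three treatment groups.
schedule = rotation_schedule(["A", "B", "C"],
                             ["token", "control", "praise"], 3)
for period, assignment in enumerate(schedule, start=1):
    print(period, assignment)
```

When the number of periods is a multiple of the number of groups, every treatment group is recorded equally often by every observer group, so any drift peculiar to one observer group is spread evenly over the treatments rather than confounded with one of them.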

In within-subject designs, the critical comparisons
involve one experimental condition instituted at one time
and another condition instituted subsequently. Assuming
that observer drift is a random phenomenon, one might employ
a number of independent observer groups across all
experimental conditions. For example, if experimental conditions
each lasted a week or longer, different observers or
different groups of observers could be employed on each day of the
week. Drift among groups would thus add to the variation of
data from each condition, but would not distort comparisons
of one condition to another. An alternate procedure would
involve videotaping the behavior of interest during all
experimental conditions and showing these recordings to
observers in random order. When this is impractical,
observation of videotapes of a sample of behavior from each
experimental condition would provide a measure of the
veridicality of behavioral recording obtained in vivo across time.
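
The random-order videotape procedure can be sketched as follows. The example assumes that tapes are relabeled with neutral codes so that observers cannot infer the condition from the label; the function name, the tape identifiers, and the coding scheme are hypothetical.

```python
import random


def blinded_viewing_order(tapes_by_condition, seed=None):
    """Shuffle tapes from all conditions and hide the condition labels.

    Returns (viewing_list, key): viewing_list holds neutral codes in the
    order observers will see them; key maps each code back to its
    (condition, tape) pair and is kept by the investigator only.
    """
    rng = random.Random(seed)
    tapes = [(condition, tape)
             for condition, tape_ids in tapes_by_condition.items()
             for tape in tape_ids]
    rng.shuffle(tapes)
    key = {f"tape_{i + 1:02d}": pair for i, pair in enumerate(tapes)}
    return list(key), key


# Hypothetical tape identifiers for two experimental conditions.
tapes = {
    "baseline": ["b1", "b2", "b3", "b4"],
    "treatment": ["t1", "t2", "t3", "t4"],
}
viewing_list, key = blinded_viewing_order(tapes, seed=1973)
```

Because the order is randomized across conditions, any gradual change in the observers' criteria is distributed over baseline and treatment tapes alike instead of accumulating in the later condition.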






Barber, X., & Silver, Fact, fiction, rnd the exper- 

imenter bias etfect, Psychological Bu ll etin Monograp h er 

1968, 25./ No. 6, Par- II, 1-^29. 

Busfxell, D., Wrobel. A,, & Michael is, M, L, .\pplying 

"group** contingencies tc the rlassrooio study behavior 
of pre~-school children. Jourrial of Applied Behavior 
^L^::^i£^ 1968. 1, 55-C3. 

Clairborn, W. Expectancy effects in die classroom: A 

failure to replicate • Journal of Educ3Lic?ial Psychol- 
ogy . 1969, 60^, 377-583. 

Kass^ £. ?!ie effects of obser\^ar bias in field-experi- 
mental settings. Unpublished mastor's thesis. State 
University of New York ai: Stony Brook. 

Kass. S: O'Leary, D. ^he effects of observer bias 

in f ield--expex'imrntal settings. Paper presented at a 
Swposiian "Behavior Analysis in Education, " University 
of Kansas, Lawrence, Kansas, April 9, 1970. 

Kent/ Expectation bias in Behavioral Observation* 

Unpublished doctoral dissertation. State University of 
New York at Stony Brook, 1972, 

Madsen, C. H., Becker, C, & Thomas, D. Rules^ praise 

and ignoring; Elements of elenientC'ry classroom control 
Jou rnal of Applied Behavior Analvsia . 1968. 1., 139--151, 

McKenr,ie, H. S*, Clark, xMw Wolf, M. M, , Kbthera, R. , & 
Benson, Behavior modification of children with 

learnincr disabilities using grades iis allowances and 
tokens as back^-up reinforcers. Exceptional Children ^ 
1963, 34^ 745-752, 

O'Connor, R. Modification of social withdrawal through 

HyEnl:)olic modeling. Journal of Appl i>- d Behavior Analysi: 

1969, 2, 15-22. 

O^Lear^y, K, D* Becker, Behavior , edification of em* 

adjustt:(ent cla«s: A token reinforceatent program. 
Except i onal Children , 1967, 32^ 637-042. 

O'Leary, K. J?., Becker, W. C*, Evans, M, £: Saudargas, 

A* h token reinforcement program in a public school 
A replication and systematic analysis^. Jcrurnal of 
Applied Behavior Analysis , 1969, 2^, 3-13/ 

O'Leary, Kent, V,, & Katuwitz^ J* Shaping data 

consonant with experimental hypotheses* (Jni^bllsh^d 

11 



manuscript, State Unlvers:ty of NTew York at Stony 
Brook, 1 972 . 



O^Leary, D, , cS. Kent, v;. Bcfnavicr Mc.;o - cation for 

Social Action; Rt-^c-airch raccics and Pr.blems. Chapter 
to appear in L. HaiTiarlynck, er. • ol. (Eos }. Critical 
Issues in Research and ?ractiov , ch an pa i gn , Illinois, 
Rescjarch Press, February, 197 3, 

Romancyzk, R. G., Kent, R. , Diament, C- & 0*Leary, K. D. 
Measuring the Reliability of Observational Data: A 
Reactive Process. Journal of Applied Beh ei vior Analysis , 
in press. 

Rosenthal, R, , & Fode, K. The effects of .yjr'-rirnenter bias 
on performance of the albino rat. Ee>.r i oral Sc ience. 
1963, 8, 183-169, 

Rosenthal, R. , & Jacobsen, Lenoge. Teachc'^j expectancies: 
Determinants of pupils' K^gams. Ps / -.. .logical 
Report s, 1966, 1^, 1] 5^118. '^'"""^ 

Scott ,^ P. M., Burton, R, V., & Yarrow, M. R. Social rein- 
forcement under natural conditions. Chi Id Development . 
1967, 38, 53-63. 

Snow, R. E. Unfinished Pygmalion. Contemporary Psychology,
1969, 14, 197-199.

Thomas, D. R., Becker, W. C., & Armstrong, M. Production
and elimination of disruptive classroom behavior by
systematically varying teacher's behavior. Journal of
Applied Behavior Analysis, 1968, 1, 35-45.

Thorndike, R. L. Pygmalion in the classroom: A review.
Teachers College Record, 1969, 70, 805-807.






Observational Code for Disruptive Behavior



1) Out of chair symbol O

Purpose: Out of chair is intended to monitor the
gross motor behavior of the child removing
himself from his seat entirely. When not
permitted, such behavior (e.g., running
around the room) may interfere with the
child's learning and is potentially
distracting to others.

Description: Observable movement of the child from his
chair when not permitted or requested by
teacher. None of the child's weight is to
be supported by the chair, but the child
may be in physical contact with chair.



Critical Points: None of the child's weight is to be
supported by the chair.

Includes: Child is leaning on desk and has either
physical contact with the chair or none of
his weight is actually being supported by
the chair.

Time limits on the following beginning
with teacher's permission. Allow 15
seconds for a child to get from the
teacher's desk to his own. Allow 15 seconds
for a child to return to his own seat
after completing a task (i.e., placing a
word card on the wall). Pencil sharpening -
1½ mins. Getting a drink - 1½ mins.
(fountain in the room). Getting a
book - 1½ mins. (time limit starts from
the second that the child gets out of
seat). Going to the bathroom: (A) 2
min. limit, (B) 30 sec. limit beginning
when the child leaves bathroom.

Note: If the child returns to the chair
after 1½ mins. (or 2 mins., where
applicable), but during the 10 sec.
inter-interval period, the "O"
will be recorded in the 20 sec.
interval just prior to the 10 sec.
interval.

Going to get a reading book during a math
lesson. When a child is fully standing and
the back of legs touch chair, or child is
fully standing and is touching back of
chair with hands. Going to teacher's
desk when not permitted. Throwing away
papers. Stretching (if child actually
leaves seat).

Excludes: Retrieval of an accidentally dropped task-
related object. Leaning forward to pick
up an object even if all contact with the
chair is momentarily lost, providing the
child is not standing fully erect on feet.
Include if child begins crawling around on
floor after retrieving object; also, include
if child is moving from desk in a
crouched position, so as not to let the
teacher see him, etc.
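
The time limits and the inter-interval note given under out of chair can be sketched in code. The Python sketch below is only an illustration: it reads the limits as 15 seconds, 1½ minutes, and 2 minutes, and it assumes a repeating cycle of a 20 sec. observation interval followed by a 10 sec. inter-interval period starting at time zero. The names `TIME_LIMITS`, `exceeds_limit`, and `scoring_interval` are hypothetical and do not appear in the original code.

```python
# Hypothetical sketch of the out-of-chair time limits and the
# inter-interval rule. Assumes each 30 s cycle is a 20 s observation
# interval followed by a 10 s inter-interval period, starting at t = 0.

OBSERVE_SEC = 20
INTER_INTERVAL_SEC = 10
CYCLE_SEC = OBSERVE_SEC + INTER_INTERVAL_SEC

# Permitted time out of seat, in seconds (keys are illustrative names).
TIME_LIMITS = {
    "from_teachers_desk": 15,
    "return_after_task": 15,
    "pencil_sharpening": 90,
    "getting_drink": 90,
    "getting_book": 90,
    "bathroom": 120,
}


def exceeds_limit(activity: str, seconds_out: float) -> bool:
    """True if the child has been out of seat longer than permitted."""
    return seconds_out > TIME_LIMITS[activity]


def scoring_interval(t: float) -> int:
    """0-based index of the 20 s interval in which an event at time t
    (seconds from the start of observation) is recorded. An event that
    falls in a 10 s inter-interval period is assigned to the 20 s
    interval just prior to it, as the note requires."""
    return int(t // CYCLE_SEC)
```

Under these assumptions a late return at 25 seconds (inside the first inter-interval period) is scored in interval 0, while a return at 30 seconds belongs to interval 1.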



2) Modified out of chair symbol e 

Purpose: Modified out of chair is intended to monitor
less intense motor behavior than displayed
in out of chair, and behavior which
is usually only distracting for the child
himself rather than others.



Description: Movement of child from his chair, with some
of his weight still being supported by the
chair.



Critical Points: The child is still at his desk and some of
his weight is being supported by the chair.

Includes: Leaning forward to pick up an object even
if all contact with the chair is momentarily
lost, providing the child is not standing
fully erect on feet. Bouncing in
chair, e.g., in responding excitedly to
some event. Kneeling on chair. Sitting
on back of chair. Both feet on or in desk.
Lying across chair horizontally. Standing
near desk with one foot on the chair.

Excludes: When child is fully standing and the back
of legs touch chair. Sitting on one or
both feet. One "cheek" off chair.




3) Touching other's property symbol T 

Purpose: Touching is intended to monitor behavior
which is distracting to the child and very
often to others when the child comes into
contact with the personal property of
another.




Description: Child comes into contact with another's
property without permission to do so.

Critical Points: The critical point is that the child does
not have permission for his action, and not that
his action may or may not result in an alteration
or post hoc permission.



Includes: Grabbing, rearranging, destroying the
property of another. Using material object
as an extension of hand to touch others'
property. Hand brushing others' desk
if this act is incompatible with learning
(i.e., the child is attending to the act).
Touching desk of another, whether other
person is seated in it or not (this
includes teacher's desk). Resting elbows
on desk behind if this act is incompatible
with learning or annoys the other child.

Excludes: Touching others on the back or any part of
the body or clothing. Use of shared
possessions such as rulers, erasers, art
materials. Elbow resting on another's desk
or hand brushing against it, if the desks
are together and neighbor is not disturbed
and such an act is not incompatible with
learning. Walking past a desk, chair, etc.,
and accidentally brushing or touching the
desk, chair, etc., i.e., child is not
attending to the behavior.
Note: When child is at teacher's desk
with permission, and is waiting to
be helped, do not score idle
touching of objects on teacher's
desk. Touching should be scored
if the teacher specifically
instructs child to stop and child
continues, or if child is instructed
to perform some task at desk
and then begins to touch objects
on desk.



4) Vocalization symbol V

Purpose: Vocalization is intended to monitor verbal 

behavior which is usually distracting to 
both the child and to others. 

Description: For the sake of consistency, any audible
non-permitted vocalization is to be
recorded even though in the opinion of the
observer it did not "seem" disruptive. Any
non-permitted "audible" behavior emanating
from mouth.

Critical Points: The observer must actually hear the
vocalization. Inferences are not acceptable
except as noted below.

Includes: If vocalization is obvious, but can't be
heard (obvious - if another child responds).
Answering without being called on. Moaning.
Yawning. Any noise made with mouth
when eating - unless child has permission
to eat. Any vocalization made in response
to the disruptive behavior of another
child, e.g., telling another child to
return stolen article, crying in response to
aggression committed to his person or
possessions, etc., if the child has not
received permission specifically from the
teacher to speak. Whispering, belching,
crying, shouting, "operant" coughs or
sneezes.

Excludes: Vocalization in response to teacher's
question. Sneezing. Automatic coughing.
Note: Once a child is recognized by the
teacher, vocalization is not
scored, regardless of content of
the vocalization: crying, yelling,
etc., until the teacher specifically
instructs the child to stop.

5) Playing symbol P

Purpose: Playing is intended to monitor often subtle
manipulative behavior that is distracting
to the child and possibly also distracting
to others.

Description: Child uses his hands to play with his own
or community property, so that such behavior
is incompatible (or would be incompatible)
with learning.

Critical Points: Child uses his hands to manipulate his own
or community property.

Includes: Playing with toy car when assignment is
spelling. Playing with comb or pocketbook.
Eating only when the hands are being
used - chewing gum is not rated as P unless
child touches or manipulates it with
his hands. Poking holes in workbook.
Cleaning nails with pencil. Drawing on
self. Manipulating pencil in such a manner
as to make the behavior incompatible
with learning, e.g., throwing pencil back
and forth, or moving pencil through
air as if an airplane. Picking scabs, nails,
or nose if the desired "object" is separated
from the body and manipulated. Looking
into desk and moving arms, but does
not come out with a task-related object.
Working with or reading non-task related
material, e.g., reading page 25 when told
to read page 1, doing math when told to do
spelling, etc.

Excludes: Touching others' property. Playing with
own clothes.
Note: Include if article is removed,
e.g., buttons, scarf, etc., and
manipulated.
Lifting desk or chair with feet (rate N if
this creates audible noise). Random banging
of pencil on desk (rate N if audible).
Simple twiddling pencil if it is not seen
as being incompatible with learning.
Note: Rate twiddling pencil, banging
pencil, or putting pencil in
mouth, hair, behind ear, etc., if
child attends to such behavior and
ceases attending to assigned task.
Operational definition of attending:
child either looks at manipulated
object or begins to manipulate
object in non-random patterns
for more than 5 seconds.
Picking scabs, nails, or nose if the
desired "object" is not separate from the
body.
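
The operational definition of attending in the pencil-twiddling note can be written as a small decision rule. The Python sketch below is a hedged illustration only: it assumes the 5-second threshold applies to non-random manipulation and that looking at the object counts on its own, which is one possible reading of the note. The function and parameter names are hypothetical.

```python
# Hypothetical sketch of the "attending" rule for pencil behaviors.
# Assumption: looking at the manipulated object counts by itself, and
# the 5 s threshold applies only to non-random manipulation.

ATTENDING_THRESHOLD_SEC = 5


def attends_to_object(looks_at_object: bool,
                      nonrandom_manipulation_sec: float) -> bool:
    """True if the child is judged to attend to the manipulated object."""
    return (looks_at_object
            or nonrandom_manipulation_sec > ATTENDING_THRESHOLD_SEC)


def rate_pencil_behavior(looks_at_object: bool,
                         nonrandom_manipulation_sec: float,
                         still_on_task: bool) -> bool:
    """True if the behavior would be rated P: the child attends to the
    behavior and has ceased attending to the assigned task."""
    attending = attends_to_object(looks_at_object,
                                  nonrandom_manipulation_sec)
    return attending and not still_on_task
```

Simple twiddling while the child keeps working returns False here, matching the exclusion above.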

6) Orienting Response symbol §

Purpose: Orienting is intended to monitor the gross
motor behavior of turning around from the
designated point of reference. Such behavior
is distracting to child since it
usually precludes attending to assigned
task, and is often distracting to others.

Description: Child turns more than 90 degrees from point
of reference while seated.






Critical Points: The child must be in his seat; he may be
in a modified position, and orienting
includes both the horizontal and vertical
planes.

Includes: Turning to the child behind. Looking to
the rear of the room. Turning around in
chair or turning chair around. Leaning
back in chair more than 90 degrees.

Note: Point of reference is typically
child's desk, but may be the teacher
if the children are directed
to attend to her. If child
should turn desk at some angle,
point of reference becomes where
desk was originally, not to where
the child has moved it. Also,
the child's chin should be used
as the indicator of how far he
has turned. Therefore orienting
is rated when child's chin has
turned more than 90 degrees from
point of reference.

Excludes: Orienting during class discussions when the
teacher directs (either implicitly or explicitly)
the class to attend to a child's
explication of an answer. Orienting while
picking up a task-related object. When
child is in corner or otherwise out of his
chair.
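
The 90-degree criterion for orienting can be illustrated with a short calculation. The Python sketch below assumes, for illustration only, that the point of reference and the direction of the child's chin are each given as a two-dimensional direction vector in the horizontal plane; the original code relies on the observer's visual judgment, and the function names here are hypothetical.

```python
import math

# Hypothetical sketch: directions are (x, y) vectors in the horizontal
# plane. The reference direction points from the child toward the point
# of reference (typically the desk's original orientation).

ORIENTING_THRESHOLD_DEG = 90


def turn_angle_deg(reference, chin):
    """Unsigned angle in degrees between the two direction vectors."""
    rx, ry = reference
    cx, cy = chin
    dot = rx * cx + ry * cy
    cross = rx * cy - ry * cx
    return abs(math.degrees(math.atan2(cross, dot)))


def is_orienting(reference, chin, seated=True):
    """Orienting is rated only while the child is seated and the chin
    has turned more than 90 degrees from the point of reference."""
    return seated and turn_angle_deg(reference, chin) > ORIENTING_THRESHOLD_DEG
```

With the reference straight ahead at `(0, 1)`, a chin direction of `(-1, -1)` gives a turn of about 135 degrees and is rated, while `(1, 1)` gives about 45 degrees and is not.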

7) Noise symbol N

Purpose: Noise is intended to monitor the frequency
of distracting sounds produced by the child
other than vocalization.

Description: Child is creating any audible noise, without
permission, other than vocalization.
For the sake of consistency, any audible
sound is to be recorded even though in
the observer's opinion it did not "seem"
disruptive.

Critical Points: The observer must actually hear the sound
to rate it. Inferences are not acceptable.

Includes: Turning pages in an exaggerated manner,
producing noise. Moving desk around.
Pencil tapping. Banging of any object.
Fishing in desk without coming out with
anything or coming out with an inappropriate
object (if noise actually made in
the process). Shuffling feet more than
once each way. Any noise made while getting
out of chair without permission. In
general, any noise made in conjunction with
any disruptive behavior, e.g., any noise
made when child throws a book or other
object at another (A).

object (book or pencil). Pushing chair
back and forth during a permitted
act (e.g., to get task-related object).

8) Aggression symbol A

Purpose: To measure the highly disruptive behavior
of physical assaults.

Description: Child makes an intense movement directed
at another person so as to come into contact
with him, either directly or by using
a material object as an extension of the
hand.

Critical Points: Intention is to be recorded rather than
just accuracy of assault (e.g., aggression
is recorded if child throws pencil or
swings at another, regardless of whether
or not the pencil or motion hits the
child).

Includes: Blocking others with arms or body from
attaining goal (e.g., while walking up aisle).
Tripping. Kicking. Throwing.

Excludes: Brushing against another (include if action
is continually repeated so as to
tease or annoy).

9) Time-off-task symbol X

Purpose: Time-off-task is intended to monitor
non-attending behavior that, if excessive,
is detrimental to child's performance.

Description: Child does not do assigned work for entire
20 second interval.

Critical Points: Child makes no attending response for the
entire 20 second interval. Child must
only attend, i.e., "looking at" his work.
Inferences that "he isn't really thinking
about it" are not acceptable.






Includes: Child does not write when so assigned.
Child does not read when assigned.
Child is working on inappropriate material
(e.g., on math during spelling).

: - ext. ^ t '-C in no- 

hrlT. Xv-hen firi&'^i-^d 

:r.< Twiy sit 6 al* cc^^.^ 
entire interval. When in corner:
Child's head must be within a 45 degree
angle from the corner formed by 2 walls
(i.e., if his head is facing either of the
2 walls directly, for a 20 second period,
he would be rated X).



x." rk ? 
^ ask. 



etc. Day- 
vorking. 
additional 
h assigned 
begins to 



Excludes: Child has his hand raised to ask questions.
Child is told he may cease working if he
so desires.



10) No inappropriate behavior as defined by the above 
categories symbol - 
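
An observation record built from these categories can be summarized in a few lines. The Python sketch below is an assumption-laden illustration, not the scoring procedure of the original studies: it assumes each 20 second interval is stored as a set of the symbols scored in it, that each category is counted at most once per interval, and that "-" marks an interval with no inappropriate behavior. The function names and the summary measure are hypothetical.

```python
from collections import Counter

# Hypothetical sketch: one set of symbols per 20 s interval.
NO_BEHAVIOR = "-"


def tally(intervals):
    """Count the number of intervals in which each symbol was scored."""
    counts = Counter()
    for symbols in intervals:
        counts.update(set(symbols))
    return counts


def disruptive_rate(intervals):
    """Mean number of distinct disruptive categories per interval."""
    if not intervals:
        return 0.0
    total = sum(len(set(symbols) - {NO_BEHAVIOR}) for symbols in intervals)
    return total / len(intervals)


# Example record of four intervals for one child.
record = [{"O", "N"}, {"-"}, {"X"}, {"P", "N"}]
counts = tally(record)
rate = disruptive_rate(record)
```

For the example record, noise is counted in two intervals and the mean is 1.25 categories per interval.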






